hard label classification #635

Merged
merged 2 commits into from
Sep 11, 2023

@cogeid cogeid commented Apr 13, 2022

What does this PR do?

The previous pull request was accidentally closed and could not be reopened because the original repo was deleted, so I created a new one with the same changes.

This PR adds a new Goal Function, called "HardLabelClassification", which maximizes the semantic similarity between the original text and the generated text, subject to the generated text falling outside the target model's decision boundary.

Summary

This PR adds the Hard Label Classification goal function, which maximizes the semantic similarity between the original text and the generated text, subject to the generated text falling outside the target model's decision boundary. Below is an example use case, in which the user specifies "goal-function hard-label-classification". The goal function is based on the paper and its corresponding implementation; only the goal function is implemented as part of TextAttack.

[Image: hardlabeluse]

Additions

  • Added a new Goal Function file in the classification folder called "hardlabel_classification.py"
  • Specified the new objective function for the Goal Function in the file
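
To make the objective concrete, here is a minimal sketch of the idea described in the summary. It is not the code in "hardlabel_classification.py" and does not use the TextAttack API: every name below (`token_overlap`, `hard_label_score`, `best_candidate`, `toy_predict`) is hypothetical, the classifier is a toy, and token overlap is only a crude stand-in for a real semantic similarity measure. The point it illustrates is that the score depends only on the predicted label (no probabilities), and that similarity counts only for candidates the model labels differently from the original.

```python
from typing import Callable, Iterable, Optional, Tuple


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens (stand-in for semantic similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def hard_label_score(
    original_label: int, candidate_label: int, similarity: float
) -> Optional[float]:
    """Score a candidate using only hard labels.

    Returns None when the candidate keeps the original label (it has not
    crossed the decision boundary); otherwise returns its similarity.
    """
    if candidate_label == original_label:
        return None
    return similarity


def best_candidate(
    original: str,
    candidates: Iterable[str],
    predict: Callable[[str], int],
    similarity: Callable[[str, str], float] = token_overlap,
) -> Optional[Tuple[str, float]]:
    """Return the misclassified candidate most similar to the original."""
    original_label = predict(original)
    best: Optional[Tuple[str, float]] = None
    for text in candidates:
        score = hard_label_score(
            original_label, predict(text), similarity(original, text)
        )
        if score is not None and (best is None or score > best[1]):
            best = (text, score)
    return best


def toy_predict(text: str) -> int:
    """Hypothetical classifier: label 1 if the token 'good' appears, else 0."""
    return int("good" in text.lower().split())


original = "the movie was good"
candidates = [
    "the movie was good indeed",  # keeps label 1, so it is not adversarial
    "the movie was fine",         # label 0, shares most tokens
    "a film that is decent",      # label 0, shares no tokens
]
result = best_candidate(original, candidates, toy_predict)
print(result)
```

In this toy run the second candidate is selected: it flips the label while staying closest to the original text.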

Changes

  • The "hardlabel-classification" attack argument was created for hard label attacks.

Deletions

  • There were no deletions made for this PR.

Checklist

  • The title of your pull request should be a summary of its contribution.
  • Please write a detailed description of which parts have been newly added and which parts have been modified. Please also explain why certain changes were made.
  • If your pull request addresses an issue, please mention the issue number in the pull request description to make sure they are linked (and people consulting the issue know you are working on it).
  • To indicate a work in progress, please mark it as a draft on GitHub.
  • [ ] Make sure existing tests pass.
  • [ ] Add relevant tests. No quality testing = no merge.
  • [ ] All public methods must have informative docstrings that work nicely with sphinx. For new modules/files, please add/modify the appropriate .rst file in TextAttack/docs/apidoc.

@jxmorris12 jxmorris12 left a comment

@cogeid can you run the formatter (make format) and push? The recipe looks great. Once the checks show up, we can merge.

@@ -0,0 +1,39 @@
"""
Determine if an attack has been successful in Hard Label Classficiation.

typo: Classification

@jxmorris12

@cogeid - everything should work once you format the code, fix the typo, and push again! Please let me know if you're interested in finishing this

@qiyanjun qiyanjun merged commit f848247 into master Sep 11, 2023
3 of 5 checks passed